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ABSTRACT 

Applying the standard weighted mean formula, [^- / [S; a ~ 2 i to determine 
the weighted mean of data, n,-, drawn from a Poisson distribution, will, on average, 
underestimate the true mean by for all true mean values larger than ~3 when the 
common assumption is made that the error of the it h observation is — ma.x( v /7i7, 1). 
This small, but statistically significant offset, explains the long-known observation 
that chi-square minimization techniques which use the modified Neyman s \ 2 statistic, 
\n = Yli( n i “ i/i) 2 / max(?q-, 1), to compare Poisson-distributed data with model 
values, iji , will typically predict a total number of counts that underestimates the 
true total by about 1 count per bin. Based on my finding that the weighted mean 
of data drawn from a Poisson distribution can be determined using the formula 
[E, [ n i + ni in («;, 1)] (n; + 1) _1 ] / [E, (»« + 1) _1 - 1 propose that a. new \ 2 statistic, 

Y 2 = [»i + min (?!,-, 1) - iji] 2 / [n, + 1], should always be used to analyze Poisson- 

distributed data in preference to the modified Neyman ’s \ 2 statistic. I demonstrate the 
power and usefulness of \ 2 minimization by using two statistical fitting techniques and 
five \ 2 statistics to analyze simulated X-ray power-law 15-channel spectra with large 
and small counts per bin. I show that \ 2 minimization with the Levenberg-Marquardt 
or Powell’s method can produce excellent results (mean slope errors <^3%) with spectra 
having as few as 25 total counts. 

Subject headings: methods: numerical — methods: statistical — X-rays: general 


1 NO AO is operated by the Association of Universities for Research in Astronomy, Inc., under cooperative 

agreement with the National Science Foundation. 
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1. INTRODUCTION 


The determination of the weighted mean is the fundamental problem for chi-square (\ 2 ) 
minimization methods. The good ness-of- fit between an observation of N data values, .t m with 
errors, < 7 t -, and a model, m.,, can be determined by using the standard chi-square statistic: 
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The theory of least-squares states that the optimum value of all the parameters of the model are 
obtained when the chi-square statistic is minimized with respect to each parameter simultaneously. 
For example, the standard formula of the weighted mean can be derived bv assuming that the 
model is a constant and then solving the equation, 
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for that constant: 
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The standard weighted-mean formula thus weights every data value, inversely by its own 
variance (i.e. of). 


Let us assume that all the data values come from a pure counting experiment where each 
data value, is a random integer deviate drawn from a Poisson (1837) distribution, 


= (4) 

with a mean value of//. Let us also make the common assumption that the error of each data value 
is the square root of the mean of the parent Poisson distribution. Using these transformations, 
=> iii and o { we see that Equation (3) becomes 
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which reduces to become the definition of the sample mean: 
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In the limit of a large number of observations of the Poisson distribution P(k;p), we find that 
Equation (6) will, on average, determine the mean of the parent Poisson distribution for all true 
mean values p: 
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Applying the standard weighted mean formula, 72,<7“ 2 j / a~ 2 J . to determine the 
weighted mean of data, iii, drawn from a Poisson distribution , u?/7/, on average , determine the 
mean of the parent Poisson distribution for all true mean values if a constant weight is assigned 
to all data values (i.e. a -2 = constant). 

It is a common practice to assume that the error of a Poisson deviate n is a = y/n . 
Unfortunately, this practice causes the standard weighted-mean formula to be undefined for data 
values of zero. A simple solution to this computational problem is to arbitrarily assign a non-zero 
constant error to all Poisson deviates with a value of zero. Let us make the common assumption 
that the error of each data value, n i , is equal to or 1 — whichever is greater. Using the 
following transformations, X{ =» 72, and a x max( v /n7, 1) , we see that Equation (3) becomes 
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In the limit of a large number of observations of the Poisson distribution P(k\[t), we find that 
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( 10 ) 


where Ei(a*) is the exponential integral of x, Ei(x) = - dt. = dt (for a; > 0), and 7 

is the Euler-Mascheroni constant: 7 = lim, ( _j, (>c , n} “ l n ( 7 0 = 0.5772156649 ** * (see, e.g., 

Abramowitz & Stegun 1964). 

Let us now investigate the limit of Equation (10) with large Poisson mean values. The 
transformation of Equation (9) to Equation (10) used the power series of Ei(x), 
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which has the following asymptotic expansion: 
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From the following limit, 

/ 1! 2! 3! V 1 

X \ l+ 7 + X* + ^ i + ''j ~ x =_1, ( 13 ) 

we see that Ei(a:) asymptotically approaches the function e x /(x - 1) for large values of a\ For 
x > 13 this approximation has an error of <1%; for x > 33 the error is <0.1%. In the limit of 
large mean Poisson values, we see that the numerator of Equation (10) is dominated by the 
term while the denominator is dominated by the Ei(/i) term which asymptotically approaches the 
value of ed L /(p — 1). We then have come to the surprising conclusion that for Poisson distributions 
with large mean values, lim/y-yoo [/xn] approaches the value of p - 1 instead of the expected value of 
/'■ 

Equation (10) can also be investigated graphically. Figure la plots the difference between 
the weighted mean computed using Equation (8) and the true mean for Poisson-distributed data 
with true mean values between 0.001 and 1000. Each open square represents the weighted mean 
of 4x 10 h Poisson deviates at each given true mean value. The solid curve through the data [open 
squares in Fig. la] is the difference between Equation (10) and the true mean. Note that Equation 
(10) underestimates the true mean by ~1 for large true mean values (as predicted above). 

Applying the standard weighted mean formula, /q< 7 ~ 2 J / a ~ 2 \ > *° determine the 

weighted mean of data , n t . drawn from a Poisson distribution , wilL on average . underestimate the 
true mean bg ~1 for all true mean values larger than ~3 when the common assumption is made 
that the error of the ith observation is a t ~ max( 1). 
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Figure lb graphically confirms this finding. Increasing the error estimates from max( v /n7, 1) to 
y/m + 1 has only yielded a minor improvement. Notice that the dip in the solid curve in Fig. 1a. 
at // ^ 6 is not present in the solid curve in Fig. lb. A more radical change appears to be required 
in order for us to develop a weighted-mean formula for Poisson-distributed data. 


Let us now add one to all data values and assume that the error of each data value is the 
square root of the new data value. Using these transformations, x t => /q + 1 and a, => \/n l + 1 , 
we see that Equation (3) becomes 
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In the limit of a large number of observations of the Poisson distribution P(k\fi), we find that 
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Figure ic graphically confirms this finding. We have now made significant progress towards our 
goal of developing a weighted-mean formula for Poisson-distributed data. Applying Equation (16) 
to determine the weighted mean of Poisson-distributed data, will, on average, estimate the true 
mean with <^1% errors for true Poisson mean values // ^ 5. 

The deviation of the solid curve in Figure lc. from zero can be eliminated by making just a 
minor change to our transformations. Using the same errors as above, a t => y/iii + 1 , but now 
adding one to only those data values that are initially greater than zero, n t + min(n;, 1) , we 

see that Equation (3) becomes 
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In the limit of a large number of observations of the Poisson distribution P(k\/.i), we find that 
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Figure Id graphically confirms this finding. We have now achieved our goal of developing a 
weighted-mean formula for Poisson-distributed data.. Applying Equation (IS) to determine the 
weighted mean of Poisson-distributed data, will, on average, estimate the true mean for all true 
Poisson mean values (p >0). 


3. THE y 2 STATISTIC 


Based on my finding that the weighted mean of data drawn 
be determined using the formula pi [ n i + min ( n i? 1)] i n i + 1) 1 
that, given N observations (n*) and a model (???,), a new \ 2 statistic, 
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should always be used to analyze Poisson-distributed data in preference to the modified Neyman s 
\ 2 statistic, 
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because the weighted-mean formula for the modified Neyman’s \ 2 statistic [//n* Equation (8)] 
systematically underestimates the true mean value of Poisson-distributed data with true mean 
values // ^ 0.5 (see Fig. la). 

For Poisson-distributed data, it has long been observed that, in many cases, chi-square fits 
using the modified Neyman’s \ 2 statistic and the Pearson’s \ 2 statistic, 
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will underestimate and overestimate the total area, respectively, while the usage of the maximum 
likelihood ratio statistic for Poisson distributions, 
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preserves the total area (e.g.. Baker & Cousins 1984 and references therein). 

It has been known for decades that chi-square minimization techniques using the modified 
Neyman's \ 2 statistic to analyze Poisson-distributed data will typically predict a total number 
of counts (total area) that underestimates the true total counts by about 1 count per bin (e.g., 
Bevington 1969, Wheaton et al. 1995). The reason why this underestimation occurs is now 
obvious: the application of the modified Neyman’s y 2 statistic to Poisson-distributed data causes 
the fitted model value at each bin, m,-, to be, on average, underestimated by ~1 count for all 
true Poisson model mean values ^3. The underestimation of the true mean by one count gives 
a very large 20% error when the true mean of the data is 5 but only a 1% error when the true 
mean of the data is 100. It would clearly be difficult to detect such a small systematic error with 
small samples of Poisson-distributed data with large true mean values. Figure 1a. shows that this 
underestimation is real and is easily measurable with large samples of Poisson-distributed data. 


The number of degrees of freedom, commonly represented with the symbol //, of a chi-square 
minimization problem is the difference between the number of observations (sample size) and the 
number of free parameters (A/) of the model: v = N — M . 

The reduced chi-square of the Pearson’s \ 2 statistic is, by definition, the value of Pearson’s 
\ 2 statistic divided by the number of degrees of freedom: 


\P 


i A r /. \ 2 

1 ^ ( n t — nii) 


N - M 1 


i=i 


m, ; 


(24) 


On average, the expected reduced chi-square value of a proper y 2 statistic with a perfect model 
is one — given a large number of observations. Now let us assume that our data comes from a 
Poisson distribution with a mean value of p. In this case, the model m t will be a constant, gp 
[Equation (5)], which will, on average, have a value, gp>, given by Equation (7) in the limit of 
a large number of observations (N.B. ppt = p). The model is a constant and therefore there is 
only one degree-of-freedom: M = 1. Given these assumptions, we find that, in the limit of a large 
number of observations, the reduced chi-square of the Pearson’s \ 2 statistic with the model gp is 
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The reduced chi-square of the modified Neyman's \ 2 statistic is, by definition, the value of 
the modified Neyman’s \ 2 statistic divided by the number of degrees of freedom: 
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Now let us assume that our data comes from a Poisson distribution with a mean value of /i. In this 
case, the model m,- will be a constant, /q\ [Equation (8)], which will, on average, have a value, 
given by Equation (10) in the limit of a. large number of observations. Given these assumptions, 
we find that, in the limit of a. large number of observations, the reduced chi-square of the modified 
Neyman’s \ 2 statistic with the model /qv is 
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In the limit of large mean Poisson values, we see that the numerator of the first term of Equation 
(27) is dominated by the term while the denominator of the first term is dominated by the 
Ei(/i) term which asymptotically approaches the value of e /4 /(/i - 1). We then conclude that the 
reduced chi-square of the statistic applied to a Poisson distribution [Equation (27)] approaches 
the value of one for large true Poisson mean values. Figure 2 graphically confirms this finding; 
we see that Equation (27) reaches a value of ~1 only for very large true Poisson mean values 
(F^lOO). 

The reduced chi-square of the new x 7 statistic is, by definition, the value of the \ 2 statistic 
divided by t he number of degrees of freedom: 
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Now let us assume that our data comes from a Poisson distribution with a mean value of ji. In 
this case, the model ??i { - will be a constant, fu t [Equation (18)], which will, on average, have a 
value, /v, given by Equation (19) in the limit of a large number of observations (N.B. /ty = fi). 
Given these assumptions, we find that, in the limit of a large number of observations, the reduced 
chi-square of the new \ 2 statistic with the model is 
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Figure 2 shows that the reduced chi-square of the \ 2 statistic applied to a Poisson distribution 
[Equation (29)] approaches the value of one for small true Poisson mean values (i.e. /j ;> 7). 



Figure 3 shows the variance of the reduced chi-square of the \p, Xn> X 71 and X\ statistics as 
a function of the true Poisson mean. This figure was derived by analyzing the data used in Figure 
1 . 


4 . SIMULATED X-RAY POWER-LAW SPECTRA 


I now demonstrate the new statistic by using it to study a dataset of simulated X-ray 
power-law spectra. This dataset is based on my duplication of the simple numerical experiment 
of Nousek & Shue (1989). The number of X-ray photons per energy interval (bin) of a X-ray 
power-law spectrum is 

dN = NoE'^dE . (30) 


Over an energy range, E min < E < E 
can be determined as follows 


max keV, the expectation value for the total number of counts 
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Following Nousek & Shue, I chose the slope value of 7 =E 2.0 and used the energy range of 
0.095-0.845 keV which was split into 15 equal bins of 0.050 keV per bin. I simulated 10 4 X-ray 
spectra for each of the theoretical N values used by Nousek & Shue: 25, 50, 75, 100, 150, 250, 500, 
750, 1000, 2500, 5000, and 10 4 photons per spectrum. Figure 4 shows four of the simulated X-ray 
power-law spectra. 


4.1. Powell’s Method: Solving for 7 and N using \^, \f>, X 7 

I determined the best-fit model parameters 7 ca i c and A r ca | c for each simulated spectrum with 
Powell’s function minimization method 2 using the modified Neyman’s \ 2 statistic (\^j), Pearson s 
\ 2 statistic (\p), and the new \ 2 statistic. I used the following crude initial guesses: 7 = 0.0 and 
N = 1.3 ^j 5 n,', where n t - is the observed number of photons in the it h channel (bin). I computed 
the robust mean (average) and robust standard deviation 3 of the ratios 7 ca i c /7 an d N ca \ c /N for the 
10 4 simulated spectra of each dataset. The results of Powell’s method with two free parameters 
( 7 , A r ) using the \ 2 T , Xpo X* statistics are presented in Table 1 and Figure 5 . The first column, 


2 The primary reference for Powell’s minimization method is Powell (1964). More accessible descriptions may be 
found in the numerical- methods literature (e.g., Acton 1970, Gill, Murray &: Wright 1981, and Press et al. 1986) 

3 The robust mean given in all the tables is the mean of all values within two average deviations of the standard 
mean value. The robust standard deviation given in all the tables is 1.55 <t where <7 is the standard deviation of all 
values within two average deviat ions of the standard mean values. 


A r , of Table 1 corresponds to the total theoretical number of counts in the spectrum. The columns 
“ 7 calc/ 7 ” and u A r ca i c /AT’ are the robust mean values of the ratios of the best-fit parameters 
divided by the original value that was used to create the datasets. The parenthetical numbers are 
the robust standard deviations which can be used to determine the significance of the deviation 
from the perfect ratio value of one. For example, the first value of the 2nd column of Table 1 is 
1.002(11) which represents the value of 1.002±0.011. The deviation of this value from one (i.e. 
0.002) is statistically significant since the error of the mean is only ~0.011 /x/lO 4 or ~0 .000 1 1 . 

Figure 5 indicates that the new y 2 statistic gives the best results. Using a 5% criteria for 
both fitted parameters ( 7 , A r ), we see that the y 2 statistic gives good results for spectra with 
£50 photons. By comparison, Pearson’s \ 2 statistic requires £250 photons and the modified 
Ney man’s \ 2 statistic requires £750 photons in order to get the same quality of results. Baker & 
Cousins (1984) noted that, in many cases, \ 2 fits using the the modified Neyman’s y 2 statistic 
will underestimate the total number of counts while y 2 fits using Pearson’s y 2 statistic will 
overestimate the total number of counts; both systematic errors are clearly seen in the bottom 
panel of Figure 5. 1 stated in the previous section that the usage of the modified Neyman’s \ 2 
statistic with Poisson-distributed data will typically underestimate the total counts by one count 
per bin. My results for the statistic clearly exhibit this systematic error: the results of the 
ratio ;V ca ic/A r for spectra with N £ 250 photons (squares in the bottom panel of Fig. 5) are well 
modeled by the function (N - 15) /N where 15 is the number of bins (channels) in our spectra [see 
the dashed curve in the bottom panel of Fig. 5]. 

A comparison of my analysis of 7 ca lc/7 using the modified Neyman’s y 2 statistic ( 2 nd column 
of Table 1 ) with the analysis of Nousek & Shue for Powell’s method ( 3 rd column of their Table 
3) shows nearly identical results. In my version of this numerical experiment, 1 used the two 
parameters A and 7 while Nousek & Shue used Ao and 7 . A comparison of my analysis of N ca \ c /N 
(3rd column of Table 1 ) with their Powell’s method analysis of N C3l \ c /Nq ( 2 nd column of their 
Table 3) shows that my analysis with N ca \ c /N has produced better estimates. This should not be 
surprising because the parameter No is not an independent parameter - Ao depends on both the 
slope of the spectrum and the theoretical number of photons in the spectrum. As a general rule, 
one gets better results by solving for independent parameters instead of dependent parameters. 


4.2. Levenberg-Marquardt Method: Solving for 7 and N using y ^ , yp, y 2 

1 determined the best-fit model parameters 7 ca | c and A r ca [ c for each simulated spectrum with 
Levenberg-Marquardt method 4 using the modified Neyman’s y 2 statistic (yjsj), Pearson’s y 2 
statistic (\p), and the new y 2 statistic. 1 used the previous crude initial guesses: 7 = 0.0 and 


'The primary references for Levenberg-Marquardt method are Levenberg (1944) and Marquardt (1963). More 
accessible descriptions may be found in the numerical- methods literature (e.g., Bevington 1969, Gill, Murray <k: 
Wright 1981, and Press el al. 1986) 
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A r = 1-3 X ]- 5 ri;. I computed the robust mean and robust standard deviation of the ratios 7 ca lc/? 
and N cii \ c /N for the 10 4 simulated spectra of each dataset. The results of Levenberg-Marquardt 
method with two free parameters ( 7 , N) using the \p , \ 2 statistics are presented in Table 2 
and Figure 6 . 

Figure 6 indicates that the new \ 2 statistic gives the best results. Using a 5% criteria for 
both fitted parameters ( 7 , A r ), we see that the x 2 statistic gives good results for all the spectra 
(A r ^ 25 photons). By comparison, Pearson’s \ 2 statistic requires ^100 photons and the modified 
Neyman’s \ 2 statistic requires ^500 photons in order to get the same quality of results. 

The results for the \ 2 and statistics are nearly identical with either Powell's method (Table 
1 ) or the Levenberg-Marquardt method (Table 2). This finding refutes the determination by Nousek 
& Sliue (1989) that Powell’s method gives more accurate results than the Levenberg-Marquardt 
method. 

The results for Pearson’s \ 2 improved significantly by using the Levenberg-Marquardt 
method instead of Powell’s method. An inspection of the individual fits showed that the 
Levenberg-Marquardt method with the \p statistic produced a best-fit value for N that was 
within a one-tenth of one percent of the total number of photons in the spectrum. Needless to 
say, with such an improvement in the determination of A r , a much better estimate for the slope 7 
could be determined. 

This peculiar result tells us something important about this particular minimization problem: 
an excellent estimate of the total number of photons in the best-fit spectrum is the total number 
of photons in the actual spectrum. Thus by setting N to be a constant, N = we can 

eliminate one parameter and solve for 7 alone. 


4.3. PowelPs Method: Solving for 7 using \p, \ 2 

I determined the best-fit model parameter 7 ca i c for each simulated spectrum with Powell’s 
function minimization method using the modified Neyman’s \ 2 statistic (\jv), Pearson’s \ 2 
statistic (\p), and the new \ 2 statistic. I set N = rii and used the crude initial guess of 
7 = 0 . 0 . I computed the robust mean and robust standard deviation of the ratios 7 ca lc/7 for the 
10 4 simulated spectra of each dataset. The results of Powell’s method with two free parameters 
( 7 , N) using the xjv > Xpi \ /2 statistics are presented in Table 3 and Figure 7 . 

Figure 7 indicates that the new \ 2 statistic gives the best results. Losing a 5% criteria, we 
see that the x /2 statistic gives good results for all the spectra (N 25 photons). By comparison, 
Pearson’s \ 2 statistic requires ^250 photons and the modified Neyman’s \ 2 statistic requires 
^750 photons in order to get the same quality of results. 

Fitting only for the slope 7 has improved the results for the new \ 2 statistic and the modified 
Neyman’s \ 2 statistic. The results for Pearson’s \ 2 show no improvement over the two free 
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parameter result. 


4.4. Levenberg-Marquardt Method: Solving for 7 using \p, y 2 

I determined the best-fit model parameter 7 ca j c for each simulated spectrum with the 
Levenberg-Marquardt minimization method using the modified Neyman’s x 2 statistic (x^), 
Pearson’s \ 2 statistic (xp), and the new \ 2 statistic. I set N = and used the crude initial 

guess of 7 = 0.0. 1 computed the robust mean and robust standard deviation of the ratios 7 ca lc/7 
for the 10 4 simulated spectra, of each dataset. The results of the Levenberg-Marquardt method 
with one free parameter (7) using the \p, \ 2 statistics are presented in Table 4 and Figure 8 


Figure 8 indicates that Pearson’s \ 2 statistic gives the best results. Using a 5% criteria, 
we see that both the new \ 2 statistic and the Xp statistic give good results for all the spectra 
(A T ;> 25 photons). By comparison, the modified Neyman’s \ 2 statistic still requires ^750 photons 
in order to get the same quality of results. Once again, we note that the results for the \ 2 and Xn 
statistics are nearly identical with either Powell’s method (Table 3) or the Levenberg-Marquardt 
method (Table 4). 


4.5. Error Estimates 

One expects the quality of the slope determination to degrade as the total number of photons 
in the X-ray spectra decline. Figure 9 shows the distribution of the best-fit values for the slope 7 
for the faintest spectra with a theoretical total of 100, 50, and 25 photons. As expected, the range 
of best-fit slope values measured for spectra with only N = 25 photons is considerably larger than 
the range of values for spectra with N = 100 photons. 

The Levenberg-Marquardt method not only provides best-fit values for parameters but 
it also provides an error estimate (approximately 1 a errors) of those fitted parameters. How 
believable are these error estimates? Figure 10 shows an analysis of the errors estimated by 
the Levenberg-Marquardt method when the new \ 2 statistic was used to analyze spectra with 
theoretical totals of 100, 50, and 25 photons. 

The top panel of Figure 10 shows the error analysis of spectra with N = 100 photons. The 
median slope value is 1.989 and the median error estimate is 0.194. A total of 15.87% of the 
spectra have estimates of 7 < 1.789 and 15.87% of the spectra have estimates of 7 > 2.211. 

For a normal distribution, one expects 68.26% of the deviates to be found within one standard 
deviation of the mean. Assuming that the distribution of best-fit 7 values approximates a normal 
distribution, then half of the difference between the 84.13 and 15.87 percentile values of 7 can be 
used as an estimate for the slope error: cr 7 % (784.13% - 7 is . 87%)/2 = (2.211 - 1.7S9)/2 - 0.211 . 
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This value is 8.8% larger than the median Levenberg-Marquardt error estimate; a fractional error 
of 10.6% instead of the predicted 9.8%). 

The middle panel of Figure 10 shows the error analysis of spectra with N = 50 photons. The 
median slope value is 2.009 and the median error estimate is 0.301. A total of 15.87% of the 
spectra have estimates of 7 < 1.732 and 15.87% of the spectra have estimates of 7 > 2.334. This 
gives an estimated slope error of « (784.13% ” 7 i 5.87%)/2 = 0.301 . This value is exactly equal 
to the median Levenberg-Marquardt error estimate. 

The bottom panel of Figure 10 shows the error analysis of spectra with N = 25 photons. 
The median slope value is 2.071 and the median error estimate is 0.484. A total of 15.87% of 
the spectra have estimates of 7 < 1.692 and 15.87% of the spectra have estimates of 7 > 2.570. 
This gives an estimated slope error of <r 7 ^ (7s4.i3% ~ 7 i5.s 7%) /2 = 0.439. This value is 9.3% less 
than the median Levenberg-Marquardt error estimate; a fractional error of 21.2%) instead of the 
predicted 23.4%. 

The errors estimated by the Levenberg-Marquardt method are seen to be reasonable. Figure 
11 shows the simulated X-ray spectra of Fig. 4 now plotted with \ 2 fits produced by the 
Levenberg-Marquardt method with one free parmater. The Levenberg-Marquardt method has 
done a good job even with the two faintest spectra which have actual totals of only 28 and 101 
photons. 


4.6. The \ 2 \ an d Cash’s C statistics 

For the sake of completeness, 1 determined the best-fit model parameter 7 ca j c for each 
simulated spectrum with Powell’s function minimization method using the maximum likelihood 
ratio statistic for Poisson distributions, [Equation (23)], and Cash’s C statistic, 

N 

C = 2^ [mi - rii In {mi)] (33) 

i= 1 

[Equation (6) of Cash 1979]. 1 set N = n x and used the crude initial guess of 7 = 0.0. 

I computed the robust mean and robust standard deviation of the ratios 7 ca lc/7 f° r the 10 4 
simulated spectra of each dataset. The results of Powell’s method with one free parameter (7) 
using the \\ statistic and Cash’s C statistic are presented in Table 5 and Figure 12 . 

Table 5 and the right panel of Figure 12 shows that Cash’s C statistic and the maximum 
likelihood ratio statistic for Poisson distributions, \^, give identical results. This is not surprising 
because Cash’s C statistic is a variant of the more well-known X\ statistic which has been 
discussed in the literature for over 70 years (e.g., Neyman & Pearson 1928). 

I also determined the best-fit model parameter 7 ca i c for each simulated spectrum with the 
Levenberg-Marquardt minimization method using the maximum likelihood ratio statistic for 
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Poisson distributions, y 2 . I set N = ?7 and used the crude initial guess of 7 — 0.0. 1 computed 

the robust average and robust standard deviation of the ratios 7 ca lc/7 for the 10 4 simulated 
spectra of each dataset. The results of the Levenberg-Marquardt method with one free parameter 
(7) using the y 2 statistic is presented in Table 6 and Figure 12 . The maximum likelihood 
ratio statistic for Poisson distributions, y 2 , produces nearly identical results with either Powell’s 
method or the Levenberg-Marquardt minimization method. 

Of the two statistics, y 2 and the new y 2 , which is better? Although Tables 6 and 4 indicate 
that the y 2 is slightly better, we see that the actual differences between the distributions presented 
in Figure 12 are really quite negligible when compared with the overall uncertainty caused bv 
simple sampling errors (counting statistics) of the simulated X-rav spectra. 

5 . SUMMARY 

I have demonstrated that the application of the standard weighted mean formula, 

n i a i~ 2 \ / [Zi°T 2 l *° ^ eterm ^ ne weighted mean of data, 77 , drawn from a Poisson 
distribution, will, on average, underestimate the true mean by for all true mean values 
larger than ~3 when the common assumption is made that the error of the 7th observation 
is <j t = max( v /n7, 1). This small, but statistically significant offset, explains the long-known 
observation that chi-square minimization techniques which use the modified Neyman’s y 2 statistic, 
\n = — l Ji) 2 / max(n,-, 1), to compare Poisson-distributed data with model values, 7/7 will 

typically predict a. total number of counts that underestimates the true total by about 1 count per 
bin. Based on my finding that the weighted mean of data drawn from a Poisson distribution can 
be determined using the formula Yi [ ? b + min (??;, 1)] (77 + l)’ 1 ] / (77 + l)” 1 , I proposed 

that a new y 2 statistic, y 2 = Yi + min (??.,* , 1) — y z ] 2 / [77 + 1], should always be used to analyze 
Poisson-distributed data in preference to the modified Neyman’s y 2 statistic. 

1 demonstrated the power and usefulness of \ 2 minimization by using two statistical fitting 
techniques (Powell’s method and the Levenberg-Marquardt method) and five \ 2 statistics (y^, \p , 
\ 2 , y 2 , and Cash’s C) to analyze simulated X-ray power-law 15-channel spectra with large and 
small counts per bin. I showed that y 2 minimization with the Levenberg-Marquardt or Powell’s 
method can produce excellent results (mean slope errors <^3%) with spectra having as few as 25 
total counts. 

This analysis shows that there is nothing inherently wrong with either the Levenberg- 
Marquardt method or Powelfs method in the low-count regime — provided that one uses 
an appropriate \ 2 statistic for the type of data being analyzed. Given Poisson-distributed 
data, one should always use the new y 2 statistic in preference to the modified Neyman’s y 2 
statistic because that statistic produces small, but statistically significant, systematic errors with 
Poisso n -d is t r i bu ted d at a, . 
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While the new statistic is not perfect, neither is the more well-known \\ statistic (e.g., 
see Figures 2 and 3). Both statistics have problems in the very-low-count regime. The new 
statistic complements but does not replace the older \ I 2 \ statistic. Which statistic is “best” will 
generally depend on the particular problem being analyzed. An important difference between 
these two statistics is that the \\ statistic assumes that all data is perfect. With data from perfect 
counting experiments, the \\ statistic may give slightly better results than the new statistic. 
However, data is typically obtained under less-than-perfect circumstances with multiple imperfect 
detectors. The statistic, by definition, is a weighted x 2 statistic which makes it easy to assign a 
lower weight to data from poor detectors. Thus in the analysis of real data obtained with noisy 
and imperfect detectors, the statistic may well outperform the classic \\ statistic because 
low-quality data can be given a lower weight instead of being completely rejected. 

Finally, I note in passing that two simple transformations may make it possible to retrofit 
many existing computer implementations (i.e. executable binaries) of minimization algorithms 
to do minimization through the simple expedient of changing the input data from [??;] to 
[n l + min (n t) 1)], and error estimates, cr\, from [max 1)] to [\/ni + 1 J . 
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true mean : /z 


Fig. 1. — Analysis of four weighted-mean formulae applied to Poisson-distributed data. Each open 
square represents the weighted mean of 4xl0 6 Poisson deviates at each given true mean value: 

0.001 < fi < 1000. 

(a) The difference between the weighted mean computed using Equation (8), an d 

true mean, /z . The solid curve is the difference between Equation (10) and the true mean: 
{ W L - 1][1 + Ei(/<) - 7 - ln(/t)] -1 } - /i. 

(b) The difference between the weighted mean computed using Equation (14), /i a , and the 
true mean, j.i . The solid curve is the difference between Equation (15) and the true mean: 

Ml - e~T l - 1} - /a. 

(c) The difference between the weighted mean computed using Equation (16), //£, and the 
true mean, fi . The solid curve is the difference between Equation (17) and the true mean: 

{/i[l - e - "]" 1 } - /'• 

(d) The difference between the weighted mean computed using Equation (18), fi~ n and the true 
mean, // . The solid curve is the difference between Equation (19) and the true mean. The difference 
is zero because fu ( , is the weighted-mean formula for Poisson-distributed data. 




- 21 - 



0.001 0.01 0.1 1 10 100 1000 

true mean : /z 


Fig. 2. — Reduced chi-square (y 2 /V) as a function of true Poisson mean, /x, for 4 y 2 statistics: 
Pearson’s \ 2 [x|> = («» “ m,) 2 /m 8 ], the modified Ney man’s \ 2 U'n = - 

mi) 2 / max(nj, 1)], the new y 2 statistic [y 2 = (n;+min(n,-, l)-m i ) 2 /(n,-(-l)] ) and the maximum 

likelihood ratio statistic for Poisson distributions [y 2 = 2^;^, (m, - n, + n t In (?q/m,))]. The 
Poisson distributions of Figure 1 were analyzed to produce this plot. The formula for the curve 
connecting the values for modified Neyman’s y 2 statistic (xjy) is given in Equation (27). The 
formula for the curve connecting the values for new y 2 statistic is given in Equation (29). The 
dotted line shows the ideal value of one. 




variance of reduced chi-square 
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Fig. 3. — The variance of the reduced chi-square (cr 2 2 j u ) as a function of true Poisson mean, /z, for 
5 \ 2 statistics: the standard \ 2 , Pearson’s \ 2 (\p), the modified Neyman’s \ 2 {\%), the new y 2 
statistic, and the maximum likelihood ratio statistic for Poisson distributions (\ 2 ). The Poisson 
distributions of Figure 1 were analyzed to produce this plot. The formula for the variance of the 
reduced Pearson’s \ 2 statistic is 2 + /z” 1 . The dotted line shows the ideal value of two. 




photons per channel 



keV 


Fig. 4. — The dashed lines show 4 ideal X-ray power-law spectra with a total of 25, 100, 1000, and 
10000 photons. Four simulated X-ray spectra with totals of 28, 101, 1015, and 9938 photons are 
shown with la error bars estimated with Equations (9) and (14) of Gehrels (1986). (N.B. Some 
errorbars overlap and the bottom two spectra have identical data values at the 0.47 and 0.67 keV 
bins 
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Fig. 5. — Results of Powell’s method with two free parameters ( 7 , N) for three statistics: 
(circles), \^ T (squares), and \p (triangles). This figure uses the data given in Table 1. The dotted 
lines show the ideal ratio value of one. The dashed curve in the bottom panel shows the function 
( ;Y — 15)/A r which is a good model for the N ca \ c /N results of the \y statistic for all spectra with 
N ^ 250 photons. 




Levenberg-Marquardt Method 


1000 


Fig. 6. Results of the Levenberg-Marquardt method with two free parameters (7, N) for three 
statistics: \? t (circles), \|j (squares), and \fs (triangles). This figure uses the data given in Table 
2. T. he dotted lines show the ideal ratio value of one. The dashed curve in the bottom panel shows 
the function (N - 15 )/N which is a good model for the N CH \ C /N results of the statistic for all 
spectra with N ^ 250 photons. 
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Levenberg-Marquardt Method 



Fig. 8 . — Results of the Levenberg-Marquardt method with one free parameter ( 7 ) for three 
statistics: (circles), (squares), and \p (triangles). This figure uses the data given in Table 

4. The dotted lines show the ideal ratio value of one. 
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Fig. 9. — A comparison of the results of the analysis of the simulated X-ray spectra with theoretical 
totals of 100, 50, and 25 photons using the Levenberg-Marquardt method with 1 free parameter 
(left) and Powell’s method with 1 free parameter (right). Note that the histograms for the and 
\£r are nearly identical for both methods. The statistical analysis of this data, is presented in Tables 
4 and 3. 
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0.8413 
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Fig. 10 . — Error analysis of the Levenberg-Marquardt method results using the statistic with 
one free parameter. The thick curve in each panel shows the cumulative distribution of the best-fit 
estimates of the slope 7 . The right (left) thin curve in each panel shows the cumulative distribution 
of 7 plus (minus) a-, which is the error estimate of the best-fit slope value. The statistical analysis 
of this data is presented in Table 4. 
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Fig. 11.— The simulated X-ray spectra of Fig. 4 now plotted with fits. The best fits are shown 
with solid curves. The upper and lower 1 a slope estimates are shown with long dashed curves. 
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Levenberg-Marquardt Method Powell’s Method 



Tcic/r 7c*tc/7 


Fig. 12. — A comparison of the results of the Levenberg-Marquardt method with 1 free parameter 
(left) and Powell’s method with 1 free parameter (right) for three statistics: Cash’s C. The 

statistical analysis of this data is presented in Tables 3, 5, and 6. 


number of spectra 




- 32 - 


Table 1. 

Results of Powell’s method with 2 free parameters ( 7 ,N) for 3 statistics: \ ^ , \p, Xp 



Icalc/^/ 

A r caic/A' 

Ocak/7 

A'aJc/A- 

Ocalc/^/ 

Acalc/ A r 

10000 .... 

1 . 002(1 1 ) 

0.999(12) 

0.999(11) 

1 . 000 ( 12 ) 

1 . 000 ( 11 ) 

1 000 ( 12 ) 

5000 

1.003(15) 

0.998(17) 

0.998(15) 

1.001(17) 

1.000(15) 

1.000(17) 

2500 

1.007(23) 

0.994(24) 

0.996(22) 

1.002(24) 

0.999(22) 

1.000(24) 

1000 

1.019(37) 

0.987(39) 

0.992(34) 

1.007(39) 

0.999(35) 

1.003(39) 

750 

1.025(45) 

0.981(45) 

0.989(39) 

1.008(43) 

0.998(41) 

1.003(44) 

500 

1.040(58) 

0.971(56) 

0.984(47) 

1.013(54) 

0.996(51) 

1.005(55) 

250 

1.071(82) 

0.946(79) 

0.969(67) 

1.025(77) 

0.992(80) 

1.009(81) 

150 

1.09(10) 

0.92(10) 

0.952(84) 

1.04(10) 

0.99(10) 

1 . 01 ( 11 ) 

100 

1.10(13) 

0.91(13) 

0.93(10) 

1.06(12) 

0.99(13) 

1.02(13) 

75 

1.11(15) 

0.89(14) 

0.92(12) 

1.07(14) 

0.99(15) 

1.03(15) 

50 

1 . 11 ( 20 ) 

0.87(17) 

0.89(14) 

1.10(18) 

0.99(19) 

1.04(19) 

25 

1.17(50) 

0.84(24) 

0.82(19) 

1.18(26) 

1.05(40) 

1.07(28) 
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Table 2. 

Results of the Levonberg-Marquardt method with 2 free parameters ( 7 ,A r ) for 3 statistics: x'n* \p> \' 2 



7calc/ 7 tVcalc/ A 7calc/7 ^calc/N 7calc/7 N C a\c/N 


10000 1.002(11) 0.999(12) 1.000(1 1) 1.000(12) 1.000(12) 1.000(11) 

5000 1.004(15) 0 998(17) 1.000(15) 1.000(17) 1.000(15) 1.000(17) 

2500 1.007(23) 0.994(24) 0.997(22) 1.000(24) 0.999(22) 1.000(24) 

1000 1.019(37) 0.987(39) 0.995(34) 1.000(38) 0.999(35) 1.003(39) 

750 1.025(45) 0 981(45) 0.995(38) 1.000(43) 0.998(41) 1.003(44) 

500 1.040(58) 0.971(56) 0.995(46) 1.000(54) 0.996(51) 1.005(55) 

250 1.071(82) 0.946(79) 0.994(67) 1.000(76) 0.992(80) 1.009(81) 

150 1.09(10) 0.92(10) 0.986(91) 0.994(97) 0.99(10) 1.01(11) 

100 1.10(13) 0.91(13) 0.96(12) 1.00(12) 0.99(13) 1.02(13) 

75 1.11(15) 0.89(14) 0.93(13) 1.00(14) 0.99(15) 1.03(15) 

50 1.11(20) 0.87(17) 0.90(14) 0.99(17) 0.99(19) 1.04(19) 

25 1.12(35) 0.84(24) 0.88(17) 0.99(23) 1.01(29) 1.07(28) 



Table 3. 

Results of Powell’s method with 1 free parameter (7) for 3 statistics 


N 

\N 

\p 

V? 

"7 calc /0 

IcnU'il 

Oc ale/ / 

10000 .... 

1.002(11) 

0.999(11) 

1.000(11) 

5000 

1.003(15) 

0.998(15) 

1.000(15) 

2500 

1.007(23) 

0.996(22) 

0.999(22) 

1000 

1.019(37) 

0.992(34) 

0.999(35) 

750 

1.025(45) 

0.989(39) 

0.998(41) 

500 

1.040(58) 

0.984(47) 

0.996(51) 

250 

1.070(82) 

0.969(67) 

0.992(80) 

150 

1.09(11) 

0.952(84) 

0.99(10) 

100 

1.10(13) 

0.93(10) 

0.99(13) 

75 

.. 1.11(15) 

0.92(12) 

1.00(15) 

50 

1.10(19) 

0.89(14) 

1.00(18) 

25 

. . 1.06(31) 

0.82(19) 

1 03(27) 
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Table 4. 

Results of the Levenberg-Marquardt method with 1 free parameter ( 7 ) for 3 statistics: \p, 


Oralc/7 4calc/7 7calc/7 


10000 1 . 002(1 1 ) 1 . 000 ( 11 ) 1 . 000 ( 11 ) 

5000 1.004(15) 1.000(15) 1.000(15) 

2500 1.007(23) 0.999(22) 0.999(22) 

1000 1.019(37) 0.994(34) 0.999(35) 

750 1.025(45) 0.992(39) 0.998(41) 

500 1.040(58) 0.990(48) 0.996(51) 

250 1.070(82) 0.988(68) 0.992(80) 

150 1.09(10) 0.988(87) 0.99(10) 

100 1.10(13) 0.99(11) 0.99(13) 

75 1.10(15) 0.99(12) 1.00(15) 

50 1.10(19) 0.99(16) 1.00(18) 

25 1.06(31) 0.99(22) 1.03(27) 



Table 5. 

Results of Powell’s method with 1 free parameter (7) for 2 statistics: \ \ . Cash 


N 

*5 

Cash's C 

O'calc/ *7 

T'calc/l 

10000 . . . . 

1.000(11) 

1 000(11) 

5000 

1.000(15) 

1.000(15) 

2500 

1.000(22) 

1.000(22) 

1000 

1.000(34) 

1.000(34) 

750 

1.000(30) 

1.000(39) 

500 

1.000(48) 

1.000(48) 

250 

0.999(68) 

0.999(68) 

150 

0.999(88) 

0.999(88) 

100 

1.00(11) 

1.00(11) 

75 

.. 1.00(12) 

1.00(12) 

50 

1.00(16) 

1.00(16) 

25 

1.00(22) 

1.00(22) 
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Table 6. 

Results of the Levon b erg- M a rq ua rd t method with I free parameter (7) for the \{ statistic 


N Ya 

7calc/l 

10000 .... 

1 . 000 ( 11 ) 

5000 

1.000(15) 

2500 

1.000(22) 

1000 

1.000(34) 

750 

1.000(39) 

500 

1.000(48) 

250 

0.999(68) 

150 

0.999(88) 

100 

1.00(11) 

75 

1.00(12) 

50 

1.00(16) 

25 

1.00(22) 



